Abstract

Background: Recently, genome-wide association studies identified a pleiotropic gene locus, ABO, as being
significantly associated with hematological traits. To confirm the effects of ABO on hematological traits, we
examined the link between the ABO locus and hematological traits in Korean population-based cohorts.

Results: Six tagging SNPs for ABO were analyzed with regard to their effects on hematological traits [white blood
cell count (WBC), red blood cell count (RBC), platelet count (Plat), mean corpuscular volume (MCV), and mean
corpuscular hemoglobin concentration (MCHC)]. Linear regression analyses were performed, controlling for
recruitment center, sex, and age as covariates. Of the 6 tagging SNPs, 3 (rs2073823, rs8176720, and rs495828) were
significantly associated with RBC and 3 (rs2073823, rs8176717, and rs687289) with MCV (Bonferroni-corrected
threshold: p < 0.05/6 = 0.008). rs2073823 and a previously reported SNP (rs8176746), as well as rs495828 and a
previously reported SNP (rs651007), were in near-perfect linkage disequilibrium (r^2 = 0.99). Of the remaining
3 SNPs (rs8176720, rs8176717, and rs687289), rs8176717 generated an independent signal with a moderate p-value
(0.045) after adjustment for rs2073823 (the most significant SNP). We also identified a copy number variation
(CNV) that was tagged by the SNP rs8176717, the minor allele of which correlated with the deletion allele of the
CNV. Our haplotype analysis indicated that the haplotype that contained the CNV deletion was significantly
associated with MCV (beta ± se = 0.363 ± 0.118, p = 2.09 x 10^-3).

Conclusions: Our findings confirm that ABO is one of the genetic factors that are associated with hematological
traits in the Korean population. The CNV finding is notable, because most GWASs do not evaluate the link between
a CNV and phenotypic traits.
Background

The ABO gene encodes isoforms of terminal glycosyltransferases, which transfer N-acetylgalactosamine and
galactose to a common precursor (H substance). It lies on chromosome 9q34.2 and contains 7 exons [1]. Exon 7
contains a domain that distinguishes between the A and B activities of the glycosyltransferase [2]. Several
genomewide association studies (GWASs) have identified ABO as a candidate marker of the risk for coronary
artery disease (CAD) [3], in addition to established CAD markers (sE-selectin, sP-selectin, and sICAM-1) [4-6].

Hematological traits, such as red blood cell count (RBC), white blood cell count (WBC), platelet number
(Plat), hemoglobin level (Hb), and hematocrit (Hct), are measured routinely to diagnose and monitor
hematologic diseases and ascertain overall patient health. Recent GWASs on hematological traits have been
reported for Caucasian [7], Japanese [8], and African-American [9] cohorts. These studies have identified
more than 30 loci that carry common DNA polymorphisms that are linked to hematological traits.

The pleiotropic gene ABO was significantly associated with hematological traits in a Japanese [8] and an
African-American study [9]; 3 of its SNPs (rs8176746, rs651007, and rs495828) were reported in these GWASs.
rs8176746 is a nonsynonymous SNP that determines the B-type blood group [10]. rs651007 and rs495828 lie in
the promoter region and are associated with CAD [4]. To confirm the effects of ABO on hematological traits,
we examined the link between the ABO locus and hematological traits in Korean population-based cohorts.

Results 

Hematological traits 

The population characteristics and mean hematological traits are described in Table 1. Six hematological
traits [WBC, RBC, Hb, Hct, Plat, and mean corpuscular volume (MCV)] were measured experimentally, and 2
other traits [mean corpuscular hemoglobin (MCH) and mean corpuscular hemoglobin concentration (MCHC)]
were calculated using the RBC, Hb, and Hct values. Among the hematological traits, RBC correlated with Hb
and Hct (Pearson's r = 0.86 and 0.84, respectively), and MCV correlated with MCH (Pearson's r = 0.81). WBC,
Plat, and MCHC correlated only moderately with the other traits (r < 0.7). Thus, we conducted a genetic
association study of the ABO gene region with the 5 traits that were not strongly correlated with one
another (WBC, RBC, Plat, MCV, and MCHC).
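
The text states that MCH and MCHC were calculated from RBC, Hb, and Hct but does not give the formulas. The following minimal sketch assumes the conventional red cell index definitions (MCH = 10 x Hb / RBC and MCHC = 100 x Hb / Hct, with Hb in g/dl, RBC in 10^6 cells/µl, and Hct in %). The function names are illustrative and are not taken from the study.

```python
def mch_pg(hb_g_dl: float, rbc_million_per_ul: float) -> float:
    """Mean corpuscular hemoglobin (pg) = 10 * Hb (g/dl) / RBC (10^6/ul)."""
    return 10.0 * hb_g_dl / rbc_million_per_ul


def mchc_g_dl(hb_g_dl: float, hct_percent: float) -> float:
    """Mean corpuscular hemoglobin concentration (g/dl) = 100 * Hb (g/dl) / Hct (%)."""
    return 100.0 * hb_g_dl / hct_percent


if __name__ == "__main__":
    # Cohort means from Table 1, used for illustration only: an index computed
    # from mean Hb, RBC, and Hct need not equal the mean of per-person indices.
    print(round(mch_pg(13.8, 4.4), 1))
    print(round(mchc_g_dl(13.8, 41.2), 1))
```

Applied to the cohort means in Table 1, these definitions give values close to the reported mean MCH (31.2 pg) and MCHC (33.2 g/dl).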

SNP selection 

Genotypes from the Affymetrix 5.0 SNP array and imputed SNP data were obtained from the Korean Genome
Epidemiology Study (KoGES) of the National Institute of Health, Korea; the genotype data were those of the
Korea Association Resource (KARE) consortium. The genomewide SNPs have been examined in genomewide
association studies for anthropometric [11] and biochemical traits [12]. In this study, we focused on the
ABO region that was reported by a Japanese study.

Table 1 Summary of participant characteristics and hematological traits

Variables                  Mean ± standard deviation
Sample size (n)            6675
Age (years)                50.1 ± 8.8
Male (%)                   47.7
Ansung (%)                 50.5
WBC count (10^3/µl)        6.4 ± 1.9
RBC count (10^6/µl)        4.4 ± 0.4
Hb (g/dl)                  13.8 ± 1.6
Hct (%)                    41.2 ± 4.4
Platelet count (10^3/µl)   244.3 ± 60.1
MCV (fl)                   93.9 ± 5.3
MCH (pg)                   31.2 ± 2.0
MCHC (g/dl)                33.2 ± 1.3

Note. WBC: white blood cell, RBC: red blood cell, Hb: hemoglobin, Hct: hematocrit, MCV: mean corpuscular
volume, MCH: mean corpuscular hemoglobin, MCHC: mean corpuscular hemoglobin concentration.



Population stratification of the genotyped samples was tested in an earlier report [11]; neither
multidimensional scaling (MDS) analysis nor principal component analysis (PCA) showed evidence of
population stratification (Additional file 1: Figure S1). Genomic inflation factors were low, ranging
from 1.01 (WBC) to 1.03 (Hct), suggesting that population stratification was well controlled
(Additional file 2: Table S1).
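
The genomic inflation factor mentioned above is commonly computed as the median of the observed 1-df chi-square statistics divided by the median expected under the null (about 0.455). The sketch below assumes that definition and two-sided p-values as input; the source does not describe how its inflation factors were calculated, and the function name is illustrative.

```python
from statistics import NormalDist, median

# Median of a chi-square distribution with 1 degree of freedom (about 0.455).
CHI2_1DF_MEDIAN = NormalDist().inv_cdf(0.75) ** 2


def genomic_inflation(p_values):
    """Lambda = median observed 1-df chi-square / median expected under the null.

    p_values must lie in (0, 1]; each two-sided p-value is mapped back to a
    chi-square statistic through the squared standard normal quantile.
    """
    norm = NormalDist()
    chi2 = [norm.inv_cdf(1.0 - p / 2.0) ** 2 for p in p_values]
    return median(chi2) / CHI2_1DF_MEDIAN


# Evenly spaced p-values stand in for a null (uninflated) genome-wide scan.
null_p_values = [(i + 0.5) / 1000 for i in range(1000)]
lambda_null = genomic_inflation(null_p_values)
```

Values near 1, such as the 1.01 to 1.03 reported here, indicate little inflation of the test statistics.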

We initially used 76 SNPs around ABO on chromosome 9 from 135,070 kbp to 135,152 kbp. The ABO gene
boundaries were established by linkage disequilibrium (LD) analysis (Additional file 3: Figure S2).
Three LD blocks encompassed ABO and its promoter region. The 3 LD blocks included 58 SNPs, 10 of which
were genotyped by the Affymetrix 5.0 SNP array; the remaining 48 SNPs were imputed by IMPUTE, based on
the HapMap database. The characteristics of the 58 SNPs are described in Additional file 4: Table S2.
The SNPs were classified as 8 nonsynonymous SNPs, 1 synonymous SNP, 8 upstream SNPs, and 41 intron SNPs.

ABO gene SNP association study 

For the association analysis, we isolated 6 tagging SNPs for ABO. In Additional file 4: Table S2, we
describe the 6 SNP groups with high LD (r^2 > 0.9) and underline the tagging SNPs. The association
results are described in Table 2. In this study, we used a Bonferroni-corrected p-value threshold
(p < 8.3 x 10^-3) for multiple comparisons, and the significant effect sizes and p-values are marked in
Table 2. Three SNPs (rs2073823, rs8176720, and rs495828) were significantly associated with RBC, and
3 SNPs (rs2073823, rs8176717, and rs687289) were significantly associated with MCV.
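
As a rough check on the significance calls, the Bonferroni threshold and approximate p-values can be recomputed from the effect sizes and standard errors in Table 2. The sketch below assumes a standard normal (Wald) approximation for beta/se; the study itself used linear regression, so the recomputed values agree with the table only up to rounding of the reported estimates.

```python
import math

N_TESTS = 6                    # tagging SNPs tested
ALPHA = 0.05
THRESHOLD = ALPHA / N_TESTS    # about 8.3e-3


def wald_p_value(beta: float, se: float) -> float:
    """Two-sided p-value for beta/se under a standard normal approximation."""
    z = abs(beta / se)
    return math.erfc(z / math.sqrt(2.0))


# (beta, se) for MCV, taken from Table 2.
mcv_results = {
    "rs2073823": (-0.480, 0.118),
    "rs8176717": (0.356, 0.118),
    "rs687289": (-0.274, 0.097),
    "rs8176681": (0.063, 0.105),
}

# True where the approximate p-value passes the Bonferroni threshold.
significant = {
    snp: wald_p_value(beta, se) < THRESHOLD
    for snp, (beta, se) in mcv_results.items()
}
```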

To identify independent association signals, we performed a conditional analysis by including rs2073823
in the linear regression model of the other significant SNP associations. For RBC, the association
signal of rs8176720 disappeared (p-value = 0.803), but that of rs495828 remained significant
(p-value = 0.004) after adjusting for rs2073823. For MCV, rs8176717 remained moderately associated
(p-value = 0.045), but the association signal of rs687289 disappeared (p-value = 0.492). Thus, we
identified 3 independent associations (rs2073823, rs8176717, and rs495828) between ABO and
hematological traits.
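
The logic of the conditional analysis can be illustrated with simulated data: a SNP that is associated with the trait only through LD with the lead SNP loses its signal once the lead SNP is added to the regression model. The sketch below is a toy example with a small hand-written least-squares routine, not the study's analysis; the genotypes, effect size, and LD strength are invented, and the covariates (age, sex, recruitment center) are omitted.

```python
import math
import random


def invert(matrix):
    """Invert a small square matrix by Gauss-Jordan elimination with partial pivoting."""
    k = len(matrix)
    aug = [list(row) + [float(i == j) for j in range(k)] for i, row in enumerate(matrix)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(k):
            if r != col:
                factor = aug[r][col]
                aug[r] = [vr - factor * vc for vr, vc in zip(aug[r], aug[col])]
    return [row[k:] for row in aug]


def ols(predictors, y):
    """Fit y ~ intercept + predictors; return (coefficients, standard errors)."""
    X = [[1.0] + list(row) for row in predictors]
    n, k = len(X), len(X[0])
    xtx = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(k)]
    inv = invert(xtx)
    beta = [sum(inv[i][j] * xty[j] for j in range(k)) for i in range(k)]
    rss = sum((yi - sum(b * x for b, x in zip(beta, r))) ** 2 for r, yi in zip(X, y))
    sigma2 = rss / (n - k)
    se = [math.sqrt(sigma2 * inv[i][i]) for i in range(k)]
    return beta, se


def genotype(rng, maf):
    """Additive genotype (0, 1, or 2 minor alleles) under Hardy-Weinberg proportions."""
    return (rng.random() < maf) + (rng.random() < maf)


rng = random.Random(0)
lead, second, trait = [], [], []
for _ in range(6000):
    g_lead = genotype(rng, 0.2)
    # The second SNP copies the lead SNP 90% of the time, mimicking strong LD.
    g_second = g_lead if rng.random() < 0.9 else genotype(rng, 0.2)
    lead.append(g_lead)
    second.append(g_second)
    trait.append(0.5 * g_lead + rng.gauss(0.0, 1.0))  # only the lead SNP is causal

marginal, _ = ols([[g] for g in second], trait)
conditional, _ = ols([[g2, g1] for g2, g1 in zip(second, lead)], trait)
beta_marginal = marginal[1]        # second SNP alone
beta_conditional = conditional[1]  # second SNP adjusted for the lead SNP
```

In this simulation the second SNP shows a clear marginal effect that shrinks toward zero after adjustment, which is the pattern reported for rs8176720 (RBC) and rs687289 (MCV); a SNP with its own effect would keep a nonzero coefficient.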

Table 2 Association analysis of 6 high-LD-group tagging SNPs with five hematological traits by linear regression analysis, controlling for area, age, and sex as covariates

CHR  SNP        BP         A1  WBC                            RBC                            Plat                           MCV                            MCHC
                               BETA ± SE       P              BETA ± SE       P              BETA ± SE       P              BETA ± SE       P              BETA ± SE       P
9    rs2073823  135122337  A   0.056 ± 0.039   0.15           0.036 ± 0.008   2.13 x 10^-?*  -1.213 ± 1.253  0.33           -0.480 ± 0.118  5.06 x 10^-5*  0.075 ± 0.029   9.55 x 10^-3
9    rs8176720  135122694  C   0.043 ± 0.032   0.19           0.018 ± 0.006   5.58 x 10^-3*  -0.232 ± 1.033  0.82           -0.071 ± 0.098  0.47           0.025 ± 0.024   0.30
9    rs8176717  135122855  T   0.001 ± 0.039   0.97           -0.012 ± 0.008  0.13           0.885 ± 1.253   0.48           0.356 ± 0.118   2.62 x 10^-3*  -0.035 ± 0.029  0.22
9    rs687289   135126927  A   0.001 ± 0.032   0.97           0.001 ± 0.006   0.89           1.017 ± 1.025   0.32           -0.274 ± 0.097  4.59 x 10^-3*  0.020 ± 0.024   0.40
9    rs8176681  135129575  C   0.016 ± 0.035   0.64           0.009 ± 0.007   0.19           -1.488 ± 1.109  0.18           0.063 ± 0.105   0.55           0.003 ± 0.026   0.92
9    rs495828   135144688  T   -0.050 ± 0.037  0.18           -0.030 ± 0.007  2.69 x 10^-5*  2.217 ± 1.180   0.06           0.061 ± 0.112   0.58           -0.042 ± 0.027  0.12

Note. CHR: chromosome, BP: base pair, A1: minor allele, WBC: white blood cell count, RBC: red blood cell count, Plat: platelet count, MCV: mean corpuscular volume, MCHC: mean corpuscular hemoglobin concentration.
* p-values that passed the Bonferroni correction criterion for multiple testing (p < 0.008).
a Significant result after adjustment for rs2073823.

Identification of copy number variation

A copy number variation (CNV) region was detected on chromosome 9, 135,120,477-135,122,527 (Figure 1),
which includes the 3' untranslated region of the ABO gene. Because the array CGH experiment was
conducted on a subset (n = 4694) of the KoGES samples, we searched for a tagging SNP that correlated
well with the CNV region genotypes in order to maximize the sample size. The SNP rs8176717 correlated
strongly with the CNV region (r^2 = 0.96); its minor allele (T) tagged the minor (deletion) allele of
the CNV.

Figure 1 Probe intensity of the copy number variation region: log2 ratio plot of the test sample and the reference (NA10851) signal intensity.



Haplotype analysis 

We estimated the haplotypes for the 6 SNPs (Table 3). A total of 6 haplotypes were predicted,
comprising 4 common haplotypes and 2 rare haplotypes (frequencies < 0.05). One haplotype (Hap 3)
carried the minor allele of rs8176717, which tagged the CNV, and was significantly associated with MCV
(beta ± se = 0.363 ± 0.118, p-value = 2.09 x 10^-3). The other significant haplotype was Hap 4, which
was associated with RBC (beta ± se = 0.036 ± 0.008, p-value = 4.27 x 10^-6) and MCV
(beta ± se = -0.512 ± 0.119, p-value = 1.81 x 10^-5).



Discussion 

In this study, we confirmed the association between ABO and hematological traits in a large Korean
population. We also found a copy number variation that was associated with hematological traits.

Table 3 Haplotype frequencies and association results of six SNPs with red blood cell count (RBC) and mean corpuscular volume (MCV)

            Tagging SNPs for ABO gene region                                  Haplotype  RBC                            MCV
Haplotypes  rs2073823  rs8176720  rs8176717  rs687289   rs8176681  rs495828   frequency  beta ± se        P             beta ± se        P
Hap 1       G          T          G          G          C          G          0.297      0.008 ± 0.007    0.249         0.064 ± 0.106    0.544
Hap 2       G          T          G          A          T          T          0.261      -0.030 ± 0.007   2.63 x 10^-5  0.064 ± 0.111    0.563
Hap 3       G          C          T          G          T          G          0.214      -0.013 ± 0.008   0.103         0.363 ± 0.118    2.09 x 10^-3
Hap 4       A          C          G          A          T          G          0.208      0.036 ± 0.008    4.27 x 10^-6  -0.512 ± 0.119   1.81 x 10^-5
Hap 5       G          T          G          G          T          G          0.012      NA               NA            NA               NA
Hap 6       A          C          G          A          C          G          0.007      NA               NA            NA               NA

Of the 6 tagging SNPs in the ABO gene, rs2073823 was the most significant; it was in near-perfect LD
(r^2 = 0.995) with rs8176746, an SNP from the Japanese GWAS on hematological traits [8]. The minor
allele of rs8176746 is the variant that encodes the B-type blood group [10]. However, this SNP was not
reported in GWASs of hematological traits in Caucasians [7,9], possibly due to ethnic differences in
the minor allele frequency among Caucasian (0.08), Chinese (0.23), and Japanese (0.17) individuals. The
allele frequencies correspond well to the frequency of blood type B in Caucasian (~8%) and East Asian
(~22%) individuals, as inferred from the BLOODBOOK website (http://www.bloodbook.com/world-abo.html).
Using the minor allele frequency (0.08) and the mean RBC (± sd) = 4.82 ± 0.50 of Caucasians, we
estimated the number of individuals required for 80% power at alpha = 5 x 10^-8 (the genome-wide
significance level) [7,9]. To replicate the association of rs2073823 (in LD with rs8176746), 51,876
individuals would be necessary. However, the previous European study [7] used 33,623 individuals, fewer
than the number estimated to be required at the genome-wide significance level. In our study,
individuals with a minor allele of rs2073823 had elevated RBC counts but decreased MCV. Thus,
individuals with blood type B might have higher RBC counts and lower MCV than those with other blood
types, at least among Asians.
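
The order of magnitude of this sample size can be reproduced with a standard normal-approximation formula for an additive SNP effect on a quantitative trait. The sketch below is not the QUANTO calculation used in the study. It assumes the Caucasian minor allele frequency quoted above (0.08) and the trait sd of 0.50, and it further assumes that the per-allele effect equals the Korean estimate for rs2073823 on RBC (0.036, Table 2), which the source does not state explicitly.

```python
from statistics import NormalDist


def required_sample_size(maf, beta, trait_sd, alpha=5e-8, power=0.80):
    """Approximate n for a two-sided test of an additive SNP effect on a quantitative trait."""
    # Fraction of trait variance explained by the SNP under Hardy-Weinberg equilibrium.
    r2 = 2.0 * maf * (1.0 - maf) * beta ** 2 / trait_sd ** 2
    norm = NormalDist()
    z_alpha = norm.inv_cdf(1.0 - alpha / 2.0)  # critical value for the two-sided test
    z_power = norm.inv_cdf(power)              # quantile matching the target power
    return (z_alpha + z_power) ** 2 * (1.0 - r2) / r2


# Assumed inputs: MAF and sd as quoted in the text, effect size from Table 2.
n_required = required_sample_size(maf=0.08, beta=0.036, trait_sd=0.50)
```

Under these assumptions the approximation gives a required sample size on the order of 50,000, the same magnitude as the 51,876 individuals reported.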

The second highest signal was generated by an upstream SNP, rs495828, which was also reported in the
Japanese GWAS [8]; this SNP was in near-perfect LD with rs651007, which was reported in an
African-American GWAS [9]. Notably, the 3 proximal SNPs (rs651007, rs579459, and rs649129) were in
near-complete LD (r^2 = 0.99) with rs495828. Because carriers of the minor allele of these 3 SNPs have
significantly lower levels of sP-selectin [5] and sE-selectin [6] and a lower risk of CAD [4], the
relationship between hematological traits and coronary artery disease phenotypes should be examined.

The Japanese GWAS reported complete LD between rs8176746 and rs495828. To confirm the LD, we estimated
it in Europeans (r^2 = 0.010 and D' = 0.150), Africans (r^2 = 0.035 and D' = 1.000), Chinese and
Japanese (r^2 = 0.050 and D' = 1.000), and Koreans in this study (r^2 = 0.087 and D' = 1.000). Although
the Japanese study reported that rs8176746 and rs495828 are in complete LD, the data from publicly
available databases show high D' but low r^2. This suggests that rs495828 may represent an independent
association signal for RBC. A limitation of our study is that the 2 most significant SNPs (rs8176746
and rs495828) were not genotyped directly, although the minor allele frequencies of these SNPs are
similar to those reported in the Japanese GWAS [8].
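
The combination of high D' and low r^2 arises when two minor alleles never occur on the same haplotype but differ in which haplotypes carry them. This can be checked with the haplotype frequencies in Table 3, using rs2073823 as a proxy for rs8176746 (r^2 = 0.995 according to the text). The sketch below uses the standard definitions of D, D', and r^2; the function name is illustrative, and the small rare haplotypes are kept and the frequencies renormalized to sum to 1.

```python
def ld_stats(p_ab: float, p_a: float, p_b: float):
    """Return (D', r^2) from the frequency of haplotype AB and of alleles A and B.

    Assumes both sites are polymorphic (0 < p_a < 1 and 0 < p_b < 1).
    """
    d = p_ab - p_a * p_b
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max
    r2 = d ** 2 / (p_a * (1 - p_a) * p_b * (1 - p_b))
    return d_prime, r2


# Table 3 haplotypes reduced to (rs2073823 allele, rs495828 allele, frequency).
table3 = [
    ("G", "G", 0.297),  # Hap 1
    ("G", "T", 0.261),  # Hap 2
    ("G", "G", 0.214),  # Hap 3
    ("A", "G", 0.208),  # Hap 4
    ("G", "G", 0.012),  # Hap 5
    ("A", "G", 0.007),  # Hap 6
]
total = sum(f for _, _, f in table3)
p_a = sum(f for a, _, f in table3 if a == "A") / total   # rs2073823 minor allele
p_t = sum(f for _, b, f in table3 if b == "T") / total   # rs495828 minor allele
p_at = sum(f for a, b, f in table3 if a == "A" and b == "T") / total

d_prime, r2 = ld_stats(p_at, p_a, p_t)
```

Because no haplotype carries both minor alleles, D' equals 1 while r^2 stays near 0.1, the same pattern as the Korean estimates reported above (r^2 = 0.087 and D' = 1.000).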

The CNV region that we identified has been reported by 7 other studies [13-18]. The minor allele of
the CNV was a deletion in the 3' untranslated region of ABO; thus, the CNV might influence its
expression. In our results, the haplotype that carried the minor allele of the CNV-tagging SNP
(rs8176717) was significantly associated with MCV. This result is notable, because most GWASs do not
evaluate the link between a CNV and phenotypic traits. Thus, our study is a model that can be used to
correlate SNPs and CNVs.



Conclusions 

ABO is one of the genetic factors that are associated with hematological traits in East Asian
populations. We also identified a novel association between MCV and an SNP that tags a common CNV.
This result is notable, because most GWASs do not evaluate the link between a CNV and phenotypic
traits.

Methods 

Study participants 

This study was conducted as part of an ongoing 
population-based cohort of the Korean Genome and 
Epidemiology Study (KoGES). All participants were 
recruited from the cities of Ansung and Ansan in 
Gyeonggi-do Province, Korea. This study was approved 
by the Institutional Review Board of the Korea National 
Institute of Health, and all participants provided written 
informed consent for study participation. 

Hematological trait measures 

A total of 6675 samples were available for hematological trait analysis, as described in Table 1.
Venous blood samples were drawn from all participants into 4.5-ml tubes that contained K3-EDTA as an
anticoagulant and were analyzed within 30 min to 4 h of collection. Hematological traits were measured
by Seoul Clinical Laboratories Company Ltd. The ADVIA 120 hematology system (Bayer Diagnostics, USA)
was calibrated per the manufacturer's guidelines. WBC count, RBC count, platelet count, Hb level, Hct,
mean corpuscular volume (MCV), mean corpuscular hemoglobin (MCH) level, and mean corpuscular hemoglobin
concentration (MCHC) were determined automatically for all samples.

SNP determination 

The ABO gene is located on chromosome 9 at 135,120,384-135,140,451 bp. SNP genotypes were determined
using the Affymetrix 5.0 SNP array, the experimental procedures of which are detailed elsewhere [11].
Further, to increase the number of genotype markers, we imputed additional SNPs from the Affymetrix 5.0
SNP array genotypes and the HapMap database (HAPMAP 3, http://www.hapmap.org); the imputation methods
have been described [19]. The final SNPs were selected using the following criteria: minor allele
frequency > 0.1; missing rate < 10%; and Hardy-Weinberg equilibrium test p-value > 0.05 for
experimentally determined and imputed SNPs. Information on the SNPs was obtained from the dbSNP
database (http://www.ncbi.nlm.nih.gov/snp), and the genetic distance between the Korean and other
populations was calculated using the F-statistic [20]. LD blocks and pairwise LD (D' and r^2) of SNPs
were estimated, and the tagging SNPs in the ABO gene region were determined, using Haploview [21].
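
The three SNP selection criteria can be expressed as a simple filter on genotype counts. The sketch below is an illustration rather than the pipeline used in the study: it applies the thresholds stated above and an asymptotic (1-df chi-square) Hardy-Weinberg test, and the function name and example counts are invented.

```python
import math


def snp_qc(n_aa, n_ab, n_bb, n_missing,
           min_maf=0.1, max_missing=0.10, min_hwe_p=0.05):
    """Return (passes, maf, missing_rate, hwe_p) for one SNP from genotype counts."""
    n_called = n_aa + n_ab + n_bb
    missing_rate = n_missing / (n_called + n_missing)
    freq_a = (2 * n_aa + n_ab) / (2 * n_called)
    maf = min(freq_a, 1.0 - freq_a)
    if maf == 0.0:
        # A monomorphic SNP fails the MAF filter; the HWE test is undefined.
        return False, maf, missing_rate, float("nan")

    # Asymptotic Hardy-Weinberg test: observed vs. expected genotype counts.
    expected = (n_called * freq_a ** 2,
                2 * n_called * freq_a * (1 - freq_a),
                n_called * (1 - freq_a) ** 2)
    chi2 = sum((obs - exp) ** 2 / exp
               for obs, exp in zip((n_aa, n_ab, n_bb), expected))
    hwe_p = math.erfc(math.sqrt(chi2 / 2.0))  # upper tail of chi-square, 1 df

    passes = maf > min_maf and missing_rate < max_missing and hwe_p > min_hwe_p
    return passes, maf, missing_rate, hwe_p


# Counts in Hardy-Weinberg proportions pass; an excess of homozygotes fails.
ok_snp = snp_qc(n_aa=90, n_ab=420, n_bb=490, n_missing=20)
bad_snp = snp_qc(n_aa=200, n_ab=200, n_bb=600, n_missing=20)
```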

CNV determination

To identify regions of CNV, samples from 4694 participants were genotyped using the NimbleGen HD2
2x720K array comparative genomic hybridization (aCGH) assay with DNA from peripheral blood. All samples
passed experimental quality control metrics, such as the chromosome X shift and mad.1dr, as determined
using NimbleScan version 2.5 per the manufacturer's guidelines. After quality control procedures, the
signal intensity ratio between the test and reference sample (NA10851 from the HapMap cell line DNA) of
each probe was log2-transformed.

Regions of CNV were identified using the Genome Alteration Detection Analysis algorithm [22], which was
applied to the samples from 4694 participants, with T = 10, alpha = 0.2, and MinSegLen = 10. The
threshold for defining regions of CNV was set to an average log2 ratio of ±0.25 (Additional file 5:
Figure S3).
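
The final thresholding step can be sketched as follows. This illustration does not reproduce the segmentation algorithm itself; it only shows the log2 transformation of the test/reference intensity ratio and the labeling of an already segmented region by its average log2 ratio against the ±0.25 threshold. The function names, labels, and the treatment of values exactly at the threshold are assumptions.

```python
import math


def log2_ratio(test_intensity: float, reference_intensity: float) -> float:
    """log2 of the test/reference probe intensity ratio."""
    return math.log2(test_intensity / reference_intensity)


def classify_segment(probe_log2_ratios, threshold: float = 0.25) -> str:
    """Label a segment by its mean log2 ratio: 'loss', 'gain', or 'neutral'."""
    mean_ratio = sum(probe_log2_ratios) / len(probe_log2_ratios)
    if mean_ratio <= -threshold:
        return "loss"
    if mean_ratio >= threshold:
        return "gain"
    return "neutral"


# Hypothetical probe intensities for one segment in a test and reference sample.
test_probes = [410.0, 395.0, 420.0, 388.0]
reference_probes = [600.0, 590.0, 610.0, 605.0]
segment = [log2_ratio(t, r) for t, r in zip(test_probes, reference_probes)]
label = classify_segment(segment)
```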

CNV-tagging SNP 

We used tagging SNPs as proxies for CNVs to maximize the sample size. To find SNPs that tagged the
identified CNVs well, we performed a correlation analysis that was similar to that in the Wellcome
Trust Case Control Consortium CNV study [23], using calls that were identified in a GWAS with the
Affymetrix 5.0 array [11]. For each CNV, we calculated the squared Pearson's r value between CNV
regions and SNPs. We considered all SNPs within 1 Mb of the 2 estimated breakpoints (i.e., start and
end points) of each CNV region. We selected the SNP with the highest r^2 value for each CNV region.
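
The selection of a CNV-tagging SNP described above can be sketched as follows, assuming that CNV genotypes are coded as the number of deletion alleles per sample and SNP genotypes as minor allele counts. The coding, the data layout, and the SNP identifiers in the example are hypothetical.

```python
def squared_pearson_r(x, y):
    """Squared Pearson correlation between two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # an invariant CNV or monomorphic SNP carries no correlation
    return sxy ** 2 / (sxx * syy)


def best_tagging_snp(cnv_copies, cnv_start, cnv_end, snps, window=1_000_000):
    """Pick the SNP within `window` bp of either breakpoint with the highest r^2.

    snps maps a SNP id to (position, list of minor allele counts per sample).
    """
    best_id, best_r2 = None, -1.0
    for snp_id, (pos, genotypes) in snps.items():
        if cnv_start - window <= pos <= cnv_end + window:
            r2 = squared_pearson_r(cnv_copies, genotypes)
            if r2 > best_r2:
                best_id, best_r2 = snp_id, r2
    return best_id, best_r2


# Toy data: deletion allele counts for 8 samples and three hypothetical SNPs.
cnv = [0, 1, 2, 0, 1, 0, 0, 1]
snps = {
    "snp_near_good": (135_122_000, [0, 1, 2, 0, 1, 0, 0, 1]),
    "snp_near_poor": (135_500_000, [1, 0, 0, 2, 1, 0, 1, 0]),
    "snp_too_far": (137_000_000, [0, 1, 2, 0, 1, 0, 0, 1]),
}
tag_id, tag_r2 = best_tagging_snp(cnv, 135_120_477, 135_122_527, snps)
```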

Association tests 

Linear regression analysis was used to analyze the association between ABO SNPs or haplotypes of
tagging SNPs and hematological traits, controlling for gender, age, and recruitment center as
covariates. The asymptotic Hardy-Weinberg equilibrium test was conducted using PLINK (version 1.07)
[24], and all reported p-values were two-sided (alpha = 0.05). Associations between SNPs and
hematological traits were considered significant at p < 8.3 x 10^-3 after Bonferroni correction for
multiple testing of 6 SNPs. The sample size required to detect the rs2073823 association in Europeans
with 80% statistical power at the genome-wide significance level was estimated using the QUANTO
software (version 1.2.4, http://hydra.usc.edu/gxe/).

Additional files 



Additional file 1: Figure S1. Multidimensional scaling (MDS) analysis and principal component analysis
(PCA) [Cho et al., 2009].

Additional file 2: Table S1. Genomic inflation factor for hematological trait genome-wide association
studies.

Additional file 3: Figure S2. Linkage disequilibrium blocks of ABO gene region.

Additional file 4: Table S2. SNP list of ABO gene region, minor allele frequency comparison, and
genetic distance calculation between KARE and other populations. Underlined SNPs indicate the tagging
SNPs for the ABO gene region used in the main paper.

Additional file 5: Figure S3. CNV clustering results. We used CNVtools to summarize the signal
intensity data and assign a discrete CNV genotype within the CNV region. (A) Histogram of the
clustering procedure using data transformed by the linear discriminant function (LDF). (B) Cluster plot
of the CNV region predicted from the LDF signal.